Temporal Difference Learning in the Tetris Game

Authors

  • Hans Pirnay
  • Slava Arabagi
Abstract

Learning to play the game Tetris has been a common challenge in several past machine learning competitions. Most entrants have achieved tremendous success by using a heuristic cost function during the game and optimizing each move against those heuristics. This approach dates back to the 1990s, when it was shown to successfully eliminate a couple hundred lines. We decided, however, to depart completely from defining heuristics for the game and to implement a “Know-Nothingism” approach, thereby excluding the human factor, and with it human error, bias, and pride, from teaching the computer to play efficiently. Instead, we use value iteration based on the Temporal Difference (TD) methodology, combined with a Neural Network, so that the computer learns to play Tetris with minimal human interaction.
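The combination described in the abstract, TD value iteration with a neural network as the function approximator, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature size, network shape, and learning constants are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8   # assumed size of the board's feature encoding
N_HIDDEN = 16    # assumed hidden-layer width
ALPHA = 0.01     # learning rate (illustrative)
GAMMA = 0.9      # discount factor (illustrative)

# One-hidden-layer network: V(phi) = w2 . tanh(W1 @ phi)
W1 = rng.normal(0.0, 0.1, (N_HIDDEN, N_FEATURES))
w2 = rng.normal(0.0, 0.1, N_HIDDEN)

def value(phi):
    """Predicted value of a feature vector phi."""
    return float(w2 @ np.tanh(W1 @ phi))

def td_update(phi, reward, phi_next, terminal):
    """One TD(0) step: move V(phi) toward reward + GAMMA * V(phi_next)."""
    global W1, w2
    target = reward + (0.0 if terminal else GAMMA * value(phi_next))
    h = np.tanh(W1 @ phi)
    delta = target - float(w2 @ h)            # TD error
    grad_W1 = np.outer(w2 * (1.0 - h**2), phi)  # backprop through tanh
    w2 += ALPHA * delta * h                   # gradient step on output weights
    W1 += ALPHA * delta * grad_W1             # gradient step on hidden weights
    return delta
```

Repeated calls to `td_update` on observed transitions shrink the TD error, pulling the network's value estimates toward consistency with the reward signal.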


Similar Papers

Reinforcement of Local Pattern Cases for Playing Tetris

In this paper, we investigate the use of reinforcement learning in CBR for estimating and managing a legacy case base for playing the game of Tetris. Each case corresponds to a local pattern describing the relative height of a subset of columns where pieces could be placed. We evaluate these patterns through reinforcement learning to determine if significant performance improvement can be observ...


Learning to play Tetris applying reinforcement learning methods

In this paper the application of reinforcement learning to Tetris is investigated; in particular, the idea of temporal difference learning is applied to estimate the state-value function V. For two predefined reward functions, Tetris agents have been trained using an ε-greedy policy. In the numerical experiments it can be observed that the trained agents can outperform fixed-policy agents signifi...
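The ε-greedy policy mentioned here balances exploiting the learned value function against random exploration. A minimal sketch, in which `actions` and `value_of` are hypothetical stand-ins for a Tetris agent's legal moves and its learned value estimates:

```python
import random

def epsilon_greedy(actions, value_of, epsilon=0.1, rng=random):
    """Pick the best-valued action with probability 1 - epsilon,
    otherwise explore with a uniformly random legal action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=value_of)
```

With `epsilon=0` this reduces to a purely greedy policy; larger values trade performance for exploration during training.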


Least-Squares Methods in Reinforcement Learning for Control

Least-squares methods have been successfully used for prediction problems in the context of reinforcement learning, but little has been done in extending these methods to control problems. This paper presents an overview of our research efforts in using least-squares techniques for control. In our early attempts, we considered a direct extension of the Least-Squares Temporal Difference (LSTD) a...


Cross-Entropy Method for Reinforcement Learning

Reinforcement Learning methods have been successfully applied to various optimization problems. Scaling this up to real-world-sized problems has, however, proven more difficult. In this research we apply Reinforcement Learning to the game of Tetris, which has a very large state space. We not only try to learn policies for Standard Tetris but try to learn parameterized policies for Generalized Te...


Learning Tetris Using the Noisy Cross-Entropy Method

The cross-entropy method is an efficient and general optimization algorithm. However, its applicability in reinforcement learning (RL) seems to be limited because it often converges to suboptimal policies. We apply noise for preventing early convergence of the cross-entropy method, using Tetris, a computer game, for demonstration. The resulting policy outperforms previous RL algorithms by almos...
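The noise injection this abstract describes can be sketched compactly: a Gaussian over policy parameters is refit to the elite samples each generation, with extra variance added on a decreasing schedule to delay premature convergence. The quadratic `score` below is a hypothetical stand-in for a real Tetris evaluation (e.g. lines cleared), and all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def score(w):
    # stand-in objective: peaks when every parameter equals 3.0
    return -np.sum((w - 3.0) ** 2)

dim, pop, elite = 5, 50, 10
mean, std = np.zeros(dim), np.full(dim, 5.0)

for t in range(30):
    samples = rng.normal(mean, std, size=(pop, dim))       # sample population
    scores = np.array([score(s) for s in samples])
    best = samples[np.argsort(scores)[-elite:]]            # keep the elite
    mean = best.mean(axis=0)                               # refit the Gaussian
    noise = max(5.0 - t / 2.0, 0.0)                        # decreasing noise Z_t
    std = best.std(axis=0) + noise                         # noisy variance update
```

Without the `noise` term the standard deviation can collapse after a few generations, freezing the search at a suboptimal policy; the decreasing schedule keeps exploration alive early while still allowing eventual convergence.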




Publication date: 2009